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Abstract The Bayesian framework is the standard approach for data assimilation in 
reservoir modeling. This framework consists mainly in characterizing the posterior 
distribution of geologic parameters given a prior distribution and data from the reser- 
voir dynamics. Since the posterior distribution quantifies the uncertainty in the geo- 
logic parameters of the reservoir, the characterization of the posterior is fundamen- 
tal for the optimal management of reservoirs. Unfortunately, due to the large-scale 
highly-nonlinear properties of standard reservoir models, characterizing the posterior 
is computationally prohibitive. Instead, more affordable ad hoc techniques, based on 
Gaussian approximations, are often used for characterizing the posterior distribution. 
Evaluating the performance of those Gaussian approximations is typically conducted 
by assessing their ability at reproducing the truth within the confidence interval pro- 
vided by the ad hoc technique under consideration. This has the disadvantage of 
mixing-up the approximation properties of the history matching algorithm employed 
with the information content of the particular observations used, making it hard to 
evaluate the effect of the ad hoc approximations alone. 

In this paper we propose to numerically assess the performance of standard Gaus- 
sian approximations to probe the Bayesian posterior distribution. In particular we 
assess the performance of (i) the linearization around the maximum a posterior esti- 
mate, (ii) the randomized maximum likelihood and (iii) standard ensemble Kalman 
filter-type methods. In order to fully resolve the posterior distribution we implement 
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a state-of-the art MCMC method that scales well with respect to the dimension of 
the parameter space. Our implementation of the MCMC method provides the gold 
standard against which to assess the aforementioned Gaussian approximations. We 
present numerical synthetic experiments where we quantify the capability of each 
of the ad hoc Gaussian approximation in reproducing the mean and the variance of 
the posterior distribution (characterized via MCMC) associated to a data assimila- 
tion problem. Both single-phase and two-phase (oil- water) reservoir models are con- 
sidered so that fundamental differences in the resulting forward operators are high- 
lighted. The main objective of our controlled experiments is to exhibit the substantial 
discrepancies of the approximation properties of standard ad hoc Gaussian approx- 
imations. Numerical investigations of the type we present here will lead to greater 
understanding of the cost-efficient, but ad hoc, Bayesian techniques used for data as- 
similation in petroleum reservoirs, and hence ultimately to improved techniques with 
more accurate uncertainty quantification. 

Keywords First keyword • Second keyword • More 
1 Introduction 

Simulating the dynamics of a reservoir involves solving a large-scale numerical model 
that depends on parameters related to petrophysical properties of the reservoir. These 
properties need to be known at each discretization point of the physical domain of 
the reservoir. Unfortunately, direct measurements of petrophysical properties are only 
available at a small number of locations within the domain of interest. Therefore, a 
statistical description of the reservoir parameters is required to properly account for 
the uncertainty in the petrophsyical properties caused by the lack of information. A 
prior distribution of geologically consistent reservoir parameters can be generated, 
for example, from a variogram analysis conducted on static data from core samples. 
Geostatistical techniques can then be used to generate realizations of reservoir param- 
eters conditioned to static data |8|. In some cases, information concerning geologic 
facies may also be incorporated |2|. On the other hand, with the aid of downhole 
permanent sensors, measurements of reservoir flow can be continuously acquired. In 
a Bayesian framework, these flow measurements, the reservoir model and the prior 
distribution of the petrophysical properties are combined to characterize the posterior 
distribution of the reservoir parameters given dynamic (flow) data. This posterior dis- 
tribution quantifies the uncertainty in the reservoir predictions and it is essential for 
assessing the economical and environmental risk of oil recovery procedures. 

Markov Chain Monte Carlo (MCMC) methods are the standard techniques for 
sampling the posterior distribution described above. In particular, the Metropolis- 
Hasting variant of MCMC has been typically used for data assimilation in reser- 
voir models [ 3l [T0lfT9l[22l[T7l [9l[T2ll . In general, the posterior distribution that arises 
from Bayesian data assimilation does not admit a finite-dimensional parametrization, 
with the exception of very few particular cases such as the linear and Gaussian case. 
Therefore, strictly speaking, an infinite number of samples are required to define it. 
This implies, in practice, that hundreds of thousands or even millions of reservoir 
simulations may be required for standard MCMC methods to accurately characterize 
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the posterior distribution. This computational disadvantage of standard MCMC ap- 
proaches has given rise to the development of more computationally efficient MCMC 
techniques |7,9, 19, 12|. With increasing advance of computational power, the afore- 
mentioned MCMC methods may potentially become viable tools for reservoir man- 
agement in the decades to come. At the present time, however, it is essential to de- 
velop techniques that provide a reasonable characterization of the posterior with a 
computational cost that involves a limited number of reservoir model runs. It then 
comes as no surprise that, in the previous years, research on data assimilation for 
uncertainty quantification (UQ) applications has mainly focused on improving the 
efficiency and accuracy of ad-hoc ensemble-based techniques that provide approxi- 
mations of the posterior distribution based on Gaussian assumptions. 

As pointed out in the recent literature review of | 21 1, there are three main ap- 
proaches that have been consistently adopted for sampling approximations of the 
posterior distribution: (i) linearization around the maximum a posteriori (MAP) es- 
timate (LMAP), (ii) randomized maximum likelihood (RML) method and (iii) en- 
semble Kalman filter (EnKF). Under a Gaussian prior and a linear model, it can be 
shown that all these techniques provide samples of the posterior distribution 
[T6l . We therefore refer to those methods as Gaussian approximations of the pos- 
terior. For standard (nonlinear) reservoir models, the mathematical structure of the 
approximation provided by the three approaches is still unknown. Nonetheless, the 
aforementioned methods are widely applied for generating model parameters condi- 
tioned to dynamics data, which are then used for statistical analysis of reservoir per- 
formance. Consequently, in the Bayesian framework, optimal decision-making and 
risk management depend on the quality of the underlying Gaussian approximations 
of the posterior. It is therefore, fundamental to understand the accuracy and conver- 
gence properties of those Gaussian approximations in order to interpret predictions of 
uncertainty made using them, and in order to develop improved methodologies from 
them. Rigorous numerical studies of these Gaussian approximate algorithms can shed 
light on these issues and can point us towards theoretical analyses. In this paper we 
therefore provide a numerical evaluation of the performance of the ad hoc Gaussian 
approximate algorithms LMAP, RML, and variants of the EnKF methodology, by 
using an expensive, but full resolved, MCMC simulation as our gold standard. This 
approach is analogous to the recent study of similar Gaussian approximate algorithms 
arising in the context of atmospheric data assimilation 1151 . 



1 . 1 Literature review 

Although the theoretical aspects of the approximation properties of LMAP, RML and 
EnKF are unknown for the case of nonlinear models, some numerical investigations 
have been performed 1 17,3|. To our best knowledge, only the work in 1 17] provides 
an evaluation of approximate methods for sampling the posterior. To accomplish this 
goal, Liu et-al ITTtII use a standard random walk MCMC method to generate accurate 
samples of the posterior. These, in turn, are used as gold-standard against to which 
compare the performance of the approximate methods: LMAP, RML and pilot point 
methods. In their evaluation, Liu et-al use synthetic data from a single-phase one- 
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dimensional reservoir discretized with 20 gridblocks. The main conclusion of ifTTl 
is that RML provides the best uncertainty quantification when compared against the 
MCMC gold standard. In particular, RML outperforms pilot point methods whose 
application to reservoir data assimilation problems has been lately abandoned. 

Within the context of evaluating the uncertainty quantification properties of data 
assimilation techniques, it is relevant to mention the work of 1 3 1 where several meth- 
ods were compared for the synthetic PUNQ-S3 reservoir model. The main goal of 1 3 1, 
however, is to evaluate the ability of the corresponding techniques to provide confi- 
dence of interval that contain the truth estimate. Among those techniques, MCMC is 
the only one that can potentially provide accurate samples from the Bayesian poste- 
rior. Unfortunately, as stated in |3|, the MCMC results are not conclusive due to the 
small chain used for their experiments. Thus, the evaluation in 1 3 1 does not provide an 
evaluation in the strict sense of a Bayesian framework. It is worth mentioning that the 
work of 1 3 1 and 1 17| appeared almost a decade ago when EnKF had just been intro- 
duced to the history matching community 1 1 1, and so EnKF was not assessed in (TTl 
[3 1. Almost a decade later and after hundreds of publications, EnKF is now perhaps the 
only computationally feasible technique for real-time data assimilation in petroleum 
reservoirs. For a comprehensive review of EnKF for reservoir applications, we refer 
the reader to fH . 

In the recent work of |[T2ll . the EnKF is combined with an MCMC algorithm to 
improve the efficiency of standard MCMC methods for sampling the posterior in a 
Bayesian framework. Although the analysis of the approximate properties of EnKF is 
not the main goal of [12|, an implicit evaluation of EnKF is displayed. Indeed, under 
the assumption that the MCMC samples of |[T2l provide an accurate characterization 
of the posterior, then (TT\ provides a partial assessment of EnKF for approximating 
the posterior. While the aforementioned work exposes severe limitations of the EnKF 
for sampling the posterior, an evaluation with respect to other approximate methods 
remains nonexistent. 

In the context of evaluating the uncertainty quantification properties of Gaussian 
approximations of the posterior, we highlight the work of II 3 [1251 . For the PUNQ-S3 
model mentioned above, 1 13] compares the performance of EnKF and RML. In | 25l, 
a new SVD-based RML is introduced and compared against EnKF. It is important 
to mention, however, that I13II25I evaluate the capability of these Gaussian approx- 
imations for reproducing the truth within the confidence of interval of the technique 
under consideration. While this is a natural strategy for assessing uncertainty quantifi- 
cation properties, it is an insufficient evaluation from the perspective of the Bayesian 
framework. In other words, capturing the truth within the spread of model predic- 
tions obtained with a Gaussian approximation does not ensure that the spread cor- 
rectly represents the uncertainty quantified by the posterior distribution of Bayesian 
data assimilation. It is therefore essential to develop a controlled experiment where 
standard Gaussian approximation can be tested against the solution to the Bayesian 
data assimilation problem: the posterior distribution. 
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1.2 The proposed work 

In this paper we propose the numerical evaluation of LMAP, RML and some standard 
versions of ensemble Kalman filter-type methods for approximating the posterior dis- 
tribution within the Bayesian framework of data assimilation. We characterize the 
posterior distribution by using a state-of-the art MCMC method that provides a gold- 
standard against which to compare the aforementioned Gaussian approximations. In 
this sense, our work has a similar goal to the one of |17|. However, there are two 
recent algorithmic developments which motivate our desire to revisit the perspective 
introduced in 117 1. The first, discussed in detail below, is that MCMC methodol- 
ogy has evolved significantly, enabling the study of considerably more sophisticated 
forward models and more high dimensional parameterizations of the unknown petro- 
physical quantities, leading to greater realism. The second new aspect of our work is 
the assessment of ensemble Kalman filter- type methods. In particular we consider the 
most standard EnKF implementations, namely: (i) the perturbed observation EnKF; 
and (ii) the square root filter EnSRF of 123 111 01 . In both cases we also evaluate the 
effect of performing distance-based localization f5l[TT1l. 

We emphasize that, in contrast to other approaches 1 131125 [ [3l[5l[TT1l where the aim 
is to recover the truth within the confidence interval of relevant quantities and to 
history-match data, here we are interested in assessing the performance for charac- 
terizing the posterior distribution. Evaluation of algorithms by their ability to recover 
the truth within a confidence interval has the disadvantage of entangling the approx- 
imation properties of the history matching algorithm employed with the information 
content of the particular observations used, making it hard to evaluate the effect of 
the ad hoc approximations alone. Our assessment of the ability to probe the Bayesian 
posterior distribution is conducted by quantifying the capability of each Gaussian 
approximation in reproducing the mean and the variance of the posterior distribu- 
tion associated to a data assimilation problem. Two prototypical reservoir models are 
used, both in two spatial dimensions: (i) slightly compressible single-phase Darcy 
flow model and (ii) incompressible oil- water reservoir model. In both models, the 
unknown is the logarithm of the absolute permeability of the reservoir u = log^. 
For the single-phase model, pressure data is collected from production wells. For the 
oil-water model, total flow rate is measured at the production wells while bottom- 
hole pressure is collected at the injection wells. For both models considered here, the 
corresponding parameter-to-output map G{u) is nonlinear. Thus, even when the prior 
distribution is Gaussian, the Bayesian posterior is non-Gaussian. This constitutes the 
ideal scenario to evaluate approximation properties of the techniques of interest, pro- 
vided that a gold standard is obtained from accurately sampling the posterior as we 
describe below. 

As we indicated earlier, some MCMC methods have been used for sampling the 
Bayesian posterior distribution in reservoir models. However, some of these meth- 
ods 1 9, 19 1 rely on reducing the parameter space (e.g. via truncating Karhunen-Loeve 
expansion) and/or upscaling the model to reduce the computational cost of the al- 
gorithm. In the present experiment, however, we are interested in the more general 
case where no reduction of the parameter space is possible. In other words, we as- 
sume that the petrophysical property is unknown at each at the location of the phys- 
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ical domain of the reservoir. In this case, a standard MCMC technique Hke the one 
used in |17| is computationally prohibitive for larger size problems like the ones 
considered here. To overcome this difficulty, we take advantage of recent develop- 
ments in MCMC methodology and sample the posterior by applying the precondi- 
tioned Crank-Nicolson (pCN) MCMC method described in |7 1, and derived from the 
infinite-dimensional Bayesian framework developed in |24|. In contrast to standard 
MCMC methods, the acceptance probability in the pCN-MCMC method is invari- 
ant with respect to the dimension of the parameter space, therefore making pCN- 
MCMC ideal for large-scale problems like the one studied here. The advantage of 
using pCN-MCMC over standard MCMC for data assimilation in some geophysical 
problems has been shown in |7 |. For petroleum reservoir applications, the compu- 
tational efficiency of pCN-MCMC with respect to other existing methods deserves 
further investigation. Nevertheless, in the present work we apply pCN-MCMC and 
provide numerical evidence of convergence so that the corresponding realizations 
generated with pCN-MCMC are samples from the Bayesian posterior. These, in turn, 
provide a gold- standard against to which compare LMAP, RML and standard ver- 
sions of ensemble Kalman filter- type of methods. The proposed numerical evaluation 
of the approximation techniques has two concrete goals: (i) assess the capability to 
recover the mean and variance of the posterior and (ii) evaluate the performance for 
reproducing the uncertainty (quantified by the posterior) in the reservoir model pre- 
dictions. 

In Section|2]we describe the prototypical reservoir models that define the forward 
operators that we use for data assimilation. The Bayesian framework for data assim- 
ilation as well as the MCMC methodology for sampling the posterior are introduced 
in Section [3] Methodologies based on Gaussian approximations of the posterior are 
described in Section]?] In Section [5] we report and discuss the numerical results and 
comparisons of our synthetic experiments. The summary and final remarks are pre- 
sented in Section [6l 



2 Forward Reservoir Models 

In this section we briefly outline the forward (reservoir) models that we use for the 
evaluation of Gaussian approximations of the posterior. On the one hand, we consider 
simplified two-dimensional models for which a forward model run is computation- 
ally inexpensive and therefore feasible for the highly computationally challenging 
MCMC method. On the other hand, by sharing the mathematical structure of more 
sophisticated models, the models we describe below are ideal for prototyping and 
evaluating performance in a controlled fashion. For each of the following models, we 
consider a two-dimensional reservoir whose physical domain, absolute permeability 
and porosity are denoted by Z), K and respectively. The interval [0, T] (T > 0) is 
the time interval of interest for the flow simulation. For each reservoir model, we 
define the forward operator G : X ^ that maps the parameter space X into the 
observation space R^. In other words, G{u) is the model predictions corresponding 
to the parameter ueX. For simplicity we assume that the only unknown parameter is 
u = logK. Nevertheless, all the techniques and implementations that we describe in 
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subsequent sections can be extended to include additional parameters, such as poros- 
ity which is routinely estimated alongside permeability in many practical scenarios. 



2.1 Single-phase Darcy flow 

We consider a single-phase reservoir where oil is produced at A^^^ production wells 
operated under prescribed production rates {q^{t)}f^^ (t G [0,7]). The flow in the 
reservoir is described in terms of the (state variable) fluid pressure p{x,t) ({x^t) G 
D X [0, T]) which is governed by the following equation [6J 

dn 

c0^-V-^"V/7= y ^^5(x-xO inDx (0,r] (1) 
/=i 

where u = logK, c is the total compressibility and 5(x — x^) is (a possibly mollified) 
Dirac delta centered at the /-th well with location denoted by xK In addition, we 
consider the following boundary and initial conditions 

-^"V/?-n = onaZ)x(0,r], (2) 
p = p^ inZ)x{0}. (3) 

As we indicated earlier, the only uncertain parameter in log-permeability 
u. Therefore, the additional model parameters c, (j),v, po and the geometry D in ([T])- 
^ are prescribed. 

In order to construct the forward operator, we first define the model predictions 
of measurements. Let us then assume that Nm measurements of pressure from wells 
are collected at times ti,...,tM- We define the measurement functional 

mUp) = p{x\tn) (4) 

that corresponds to the fluid pressure at time tn and well location x^ . For the exposition 
of subsequent sections we also define the vector 

M,{p) = {Ml{p),...,M^'^{p)). (5) 

We finally define N = N^Nm, i.e. the total number of observations from wells, and 
construct the forward operator 



G{u) = {M,{p),...,MnM) 
Note that /? in ([6]) depends on u via 



(6) 
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2.2 Oil- water reservoir model 



We consider an oil-water reservoir model initially saturated with oil and irreducible 
water. Let us index by 7 = w and y = o the water and oil phase respectively. We 
assume that both fluids and the rock are incompressible. We are interested in a wa- 
terflood process where water is injected at Ni injection wells located at {x\}f^^ . 
Water and oil are produced at A^^ production wells located 3.t{xp}f^^. Additionally, 
we assume that injection wells are operated under prescribed rates {^^(Ol/Ei while 
production wells are constrained to bottom hole pressure denoted by {^^/^(Ol/^i • 
The reservoir dynamics in [0, T] are described by the (state variables) water satura- 
tion and the pressure denoted by s{x,t) and p{x^t) respectively ((x,^) G Z) x [0,7]). 
From standard arguments it can be shown that {s^p) is the solution to the following 
system (61 



- V • A {s)e''Vp = £ ^^5(jc - 4) + ^ (O^X {s) [P[ 

1=1 1=1 



-p]5{x-x^p), (7) 



/=i 



1=1 



inDx (0, r], where 5{x — Xp) and 5{x — Xj) are the (possibly mollified) Dirac deltas 
as defined before, {co^}fZi constants related to the well model ||6J. Additionally 
Xw{s) and A(^) denote the water and total mobility defined by 



kro{s) 



(9) 



where kry{s) and fly denote the relative permeability and the viscosity of the 7-phase 
fluid, respectively. Furthermore, we assume that 

n 2 



1 



kro{s) 



1 



1 



(10) 



where a^^Go G (0, 1], Sf^y; is the irreducible water saturation and Sor is the residual oil 
saturation. We additionally prescribe initial conditions for pressure and water satura- 
tion 

p = po, s = so mDx{0} (11) 
For simplicity, no-flow boundary conditions are prescribed on the reservoir boundary 
-^"A(^)V;?-n = onaZ)x(0,r], (12) 
-e''?i^{s)Vp-n = ondDx{0,T]. (13) 

Let us assume that there are Nm measurement times denoted as before {tn}^^^. We 
assume measurements of bottom-hole pressure are collected at the injection wells at 



Nm 



J. This, according to Peaceman's well-model |6|, is defined by 



M'/{p,s) 



(o'A(^(4,f„)) 



-p{x\,tn) 



(14) 
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for / = 1 , . . . , A// and n=l,... .Nm- Analogously, we consider measurements of total 
flow rate at the production wells 

M'/{p,s) = q'''{t„) = (D'?i{s{x'p,tn))[Plh{t„) -p{x'p,t„)] (15) 

for Z = 1, . . . ,Np and n= 1, . . . ,A^m- Let us denote by A^vv = Np-\-Ni the total number 
of wells, and define the A^^^ -dimensional vector 

Mn{p,s) = {My{p,s),...,M^'\p,s),M'/{p,s),...,M^^^\p,s)) (16) 

The total number of measurements N is defined as before and the forward map G : 
X ^ is then given by expression 

G{u) = {M^{p,s),...,Mnm{p.s)) (17) 

which in this case comprises the production data obtained from production and injec- 
tion wells at the measurement times. 

With both forward models written in terms of the forward operator G, in the next 
section we describe the Bayesian inverse problem of finding u given noisy observa- 
tions of G{u). 



3 The Bayesian framework 

We assume that the unknown parameter u and the data G F are related by 

y = G{u)^n (18) 

where G is the forward map introduced in the previous section, and i] G is a vector 
of random noise. Informally the Bayesian approach to inversion proceeds by placing 
a prior probability distribution ¥{u) on u and assuming an independent probability 
distribution on 77. The likelihood, namely the probability of the observed data y given 
a particular instance of the unknown parameter u, is then denoted Bayes' 
rule then states that the posterior probability of the unknown parameter u, given the 
observed data y, denoted by P(w|y), is determined by the formula 

In this section we formulate this precisely in the case where the unknown parameter 
is a function. 

We define the norm 1 1 • 1 1^ = 1 ) 1 1 for any covariance operator B and we use 
this notation throughout the paper, in particular in the observation space, with B = r, 
and in the log-permeability space with B = C. For simplicity and following conven- 
tion in the field, we will not distinguish notationally between the random variable and 
its realization, except in the case of the truth, which will be important to distinguish 
by in subsequent sections in which it will be prescribed and known. 
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3.1 An infinite-dimensional Bayesian framework 

We are interested in the inverse problem of characterizing the posterior distribution of 
the unknown log-permedihility function u given finite -dimensional observational data 
denoted by y. We approach this inverse problem by means of the infinite-dimensional 
Bayesian framework of |24 1 that we now briefly describe. Assume that u where 
X is a Hilbert space, and denote by {Iq the prior probability measure on u. We assume 
that the unknown u and the data j G F are related by ([18]). For simplicity, we as- 



sume that 7] ~ A^(0,r). Then, the rigorous interpretation of ( 19 ) is that the posterior 
distribution on u\y is given by measure /i satisfying 

^ (u) = ^^P(-^(^^>-)) (20) 

djio JxQxp{-^{u,y))jio{du) 



where the left hand side of ([20]) is the Radon-Nikodym derivative of the posterior 
distribution }j.{u) = with respect to the prior /io and 

<P{u,y) = \\\y-G{u)\\l. 

A sufficient condition for this to be well-defined is that is continuous as a 

mapping from X into R for each fixed y, and that IAq{X) = 1 so that functions drawn 
from jio are in X almost surely. The formula ( |20| then holds in infinite-dimensions, 
exhibiting the posterior density with respect to the prior; in practical terms this means 
that posterior expectations can be found by reweighting prior expectations by the 



right-hand side of (20). 



The posterior distribution fi quantifies the uncertainty of the logarithm of the 
absolute permeability given production data, normalized by the prior. Since G is non- 
linear, the posterior is non-Gaussian even when the prior /io is Gaussian. Thus there 
is no useful closed- form expression for the posterior distribution and it must be char- 
acterized by means of sampling. 

Before we describe the approach for sampling ji, we note that, if we assume the 
prior /io is Gaussian with mean u and covariance C, it follows that the maximum a 
posteriori (MAP) estimate umap is the minimizer of the functional 

J{u) = 0{u,y)^^\\u-u\\l (21) 

The MAP estimator is the typical estimate computed in standard history matching 
problems where the goal is to recover the truth by fitting historic production data. 
Note that the Bayesian approach thus subsumes this classical approach to inversion, 
whilst also providing rigorous quantification of uncertainty of predictions, given clear 
assumptions on the prior and noise probabilities. 



3.2 Sampling the posterior with MCMC 



A state-of-the-art class of MCMC methods that sample from fi defined in ( 20 ) has 
been proposed in |7 1. In particular, we consider the following preconditioned Crank- 
Nicolson (pCN) MCMC |7, Section 5.2]. 
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Algorithm 1 (pCN-MCMC ) Take u^^^ ^N{u,C),n = 1, and j3 G (0, 1). Then, 

(1) pcN proposal Generate ufrom 

VT^^wW + (l- VT^^)i/ + j3(^, with ^^N{0,C) (22) 

(2) Set u^^^ — u with probability a{u^^u) and u^^^ = with probability l—a{u^^u), 
where 

a{u,v) = min|l,exp(0(w,3;) - 0{v,y))^ (23) 

(3) n\-^n-\-l and repeat. 

All the probabilities are generated independently of one another, leading to a 
Markov chain which is invariant with respect to {l. Notice that the small change 
in proposal, when compared with the standard random walk MCMC ifTTl . results 
in an acceptance probability defined via differences of and not /. Because <P is 
finite with respect to /i, whilst / is not, this leads to a considerably improved algo- 
rithm which has desirable J/m(X) -independent properties when implemented on a 
sequence of approximating problems with dim{X) oo. Therefore, for large dim{X) 
like the one considered here, pCN-MCMC provides a more robust and efficient tech- 
nique than the standard MCMC approaches. Numerical evidence of these scaling 
properties can be found in 1 7 1 . 

Based on the forward models of Section [2j the purpose of the present work is to 
design synthetic experiments for solving the Bayesian data assimilation problem, i.e. 
finding the posterior distribution. By implementing the pcN-MCMC algorithm, we 
characterize this posterior and generate a gold standard against to which compare the 
Gaussian approximations that we introduce in the following section. 



4 Gaussian approximations of the posterior 

In this section we introduce some standard ad-hoc methods that use Gaussian approx- 



imations to sample the posterior distribution (20). In particular, we consider LMAP, 
RML, EnKF and EnSRF which have been typically used for history matching and un- 
certainty quantification in the Bayesian framework of data assimilation of petroleum 
reservoirs. While many variants of the aforementioned techniques can be found in 
the literature |21 1, here we focus on the most standard and typical implementations 
used for history matching. For each of the aforementioned techniques, the objective 
of the subsequent description is twofold. First, we introduce the algorithm and the as- 
sociated computational cost. Second, we indicate the type of Gaussian approximation 
made for the definition of the technique under consideration. 
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4.1 Linearization around the MAP (LMAP) 

As described in Section [3] the minimizer of / introduced in pT] ) defines the MAP 
estimator, i.e. 

umap = argmin^|0(w,};) -^^\\u-u\ 1^} (24) 
We can further define, 

Cmap = C-CQ\QCQ^ ^ry'QC (25) 

where Q = DG{umap) is the Frechet derivative of G evaluated at w = umap- The Hn- 
earization around the MAP | 20, Section 10.5] consists of approximating the posterior 



11 in Q by ;U ^ N{umap,Cmap)- 

The LMAP algorithm approximates the posterior with an ensemble of Nen real- 
izations from N{umap-,Cmap) |20, Section 10.5]. This ensemble can then be used to 
approximate integrals with respect to the posterior of nonlinear functions of u. Note 
that when G is linear, ji = N{umap^Cmap) and then umap and Cmap are the mean and 
covariance of the posterior. The algorithm, however, is well-defined in general, and 
may thus be applied to cases in which G is nonlinear. 

Algorithm 2 (LMAP) (1) Compute umap <^nd Cmap from and ^^5\respectively. 

(2) Compute the Chole sky factor L of Cmap , i-^- Cmap = LlT^ 

(3) For j G {1, . . .Nen}y generate 

u^^^ =UMAP^L^z^j^ (26) 

where z^^^ -A^(0,/). 



Samples generated by (26) are draws from N^umapiCmap) and so the ensemble 
W^^^}^j=\ provides an approximation to N{umap-,Cmap) and, hence the posterior. 
The computational cost of LMAP depends on the cost of computing the MAP 



estimator (24) and the factorization of Cmap- For the present work, we develop im- 
plementations of the Levenberg-Marquardt algorithm of [W, Section 8.4] with the 
stopping criteria given by (8.82) and (8.83) from [20, Section 8.5]. It is worth men- 
tioning that, within the context of reservoir characterizations, multiple techniques 
for computing the MAP estimator have been widely studied (e.g. BFGS, LBFGS, 
Gauss-Newton) |20 , Section 8]. It is of interest to evaluate the optimal minimization 
technique, but this is beyond the scope of our present work. 



4.2 Randomized Maximum Likelihood (RML) 

The RML technique was developed as an attempt to accelerate MCMC methods for 
sampling the posterior from Bayesian data assimilation in reservoir models ll22l . The 
main idea of RML is to construct an ensemble of MAP estimators from random- 
ized objective functions ( [24] ). A standard implementation of RML is presented in the 
following algorithm 
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Algorithm 3 (RML) For j e {1, . . . ,Ne} 

(1) Generate u^^^ ~ N{u^ C) 

(2) Define y^j^ = j + r](-^") with rf^j^ - A^(0,r). 

(3) Compute 

= argmin, [^{u, ) ^ \\\u - u^^^ . (27) 

In the case where G is Hnear, the RML algorithm can be shown to sample the posterior 
distribution (i.e. from fi = N{umap-,Cmap)) For the nonlinear case of interest 
here, the RML algorithm provides an approximation the nature of which, to the best 
of our knowledge, has not been systematically understood. 

Note that for each ensemble member, RML requires the solution to the minimiza- 



tion problem (27). Nevertheless, since each minimization problem is independent 
from one another, RML is then embarrassingly parallelizable. Each of those min- 
imization problems has the same structure as the one that we solve for the MAP 
estimator ( [24| ). For a relatively small problem the computational cost of computing 
L in LMAP, given Q which has already been constructed while solving p4| ), as well 
as the generation of ([26]), are negligible compared to the cost of one forward model 
evaluation. Thus the computational cost of RML is roughly Ne times the computa- 
tional cost of LMAP, although the effect of the multiplier Ne can be ameliorated in a 
parallel context. 

Similarly to our implementation of the MAP, for RML we consider the Levenberg- 
Marquardt method and the corresponding stopping criteria mentioned above. Improv- 
ing the optimization technique required for ([27]) can reduce the overall computational 
cost of RML. Alternative methods to reduce the computational cost of RML by means 
of a truncated S VD approach can be found in [25] . 



4.3 Ensemble Kalman Filter (EnKF) 

Ensemble methods based on the Kalman filter have been extensively applied for 
Bayesian data assimilation in petroleum reservoir applications. For a complete re- 
view of most of the EnKF implementations we refer the reader to the monograph of 
LI |. In this section we briefly discuss some relevant aspects of EnKF in the context of 
history matching of petroleum reservoirs. These ensemble Kalman filter- type of algo- 
rithms, make Gaussian approximations in a sequential manner as we describe below. 
As a result, for the general case, those techniques do not provide correct sampling of 



the posterior (20). Nevertheless, due to its ease of implementation and low compu- 
tational cost, ensemble Kalman filter-type of methods are arguably the only feasible 
techniques for online data assimilation in subsurface applications. 

4.3.1 Introduction and Main Algorithm 

In order to introduce the algorithms, we first consider a sequential formulation of the 
reservoir model. In particular, let us define v„ the state variable at time tn and S the 
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State space. For example, for the single-phase model of Section [2j v„ = p{x^tn). We 
define the solution operator Wn '. S xX ^ S 

Vn = %iVn-Uu) (28) 

which, for a given parameter u, maps the state variable from time t = t^-i to t = t^. 
In practice, ^ is simply the numerical solver that arises from the time discretization 
of the reservoir model under consideration. In addition, we assume that data is given 
at each of these points in time and is correlated between times only through the state 
itself, i.e. 

yn=Mn{Vn)^rin, (29) 

where rfn ^ N{0,rn) and Mn S ^ is the measurement functional acting on the 
state variable at time t = tn. For the models of Section [2j M„ is defined by ^ and 
( p^ respectively and A^^ is the number of total wells. Define, 

u 

^n{z)=\ %{V,U) |. (30) 

Since the permeability in the forward reservoir model does not change in time, it 
follows that ([28])-([29|) can be written as 

Zn=^n{Zn-l), (31) 
yn =HZn^nn- (32) 

where H = (0,0,7). We now consider the following standard perturbed observation 
version of EnKF ifB . 

Algorithm 4 (EnKF) Construct an initial ensemble 

zf""' = i "vo ] (33) 
\Moivo)J 

where {mI/^I^^i ~ Mo and vq is the initial condition for the state variable. For j = 
l,...,N^ 




(1) Prediction Step: Propagate the ensemble of particles forward under {31) giving 

= ye{l,...,A^J (34) 

From this ensemble we define a sample mean and covariance as follows: 

z{i = ^^Y.jL\Zn''^^ (35) 

Cl = ^L%,zi'''\zi^''V (36) 
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(2) Analysis step: Compute the updated ensembles 



. Shf) 

■ Zn 



(37) 



where 



K„=C(H^(HC[H^+r„y 



(38) 



and 



AJ) 

yn 



U) 



(39) 



Here the rfn'' are chosen i.i.d. In order to discuss the computational cost of the EnKF 
algorithm, let us first note that all the vectors and matrices involved have block struc- 
ture inherited from the structure of the space Z = X x S x . For example, we have 



,U,f) 



(hf) 



( 





(ui\ 











We also have 









( r^uj 

























Then, expression ( 37 ) can be written as 



-r)-\yi^^-MMv^1A'-1))) 



(40) 



The submatrices in cl needed for (|40| are given by 



X^Ne ,,UJ) (.SjJ)\T 



— Ne 



1 ^^n 



(41) 
(42) 
(43) 



We recall that w G where A^^^ is the number of wells. Typically A^^^ is much 
smaller than the dimensions of the (discretized) parameter space X. Consequently, 
the computational cost of constructing and C^^ and inverting the (C^^ +1^)"^ 
in ( [4o| is negligible compared to the cost of computing En {z^^-i ) ' which from ( 30 ) we 
can see is mainly determined by the cost of ^(v^^{^ , u^^^^) (i.e. running the reservoir 
simulator in the time-interval [tn-\^ tn] • Therefore, the computational cost of the EnKF 
is approximately Ne times the cost of a forward model simulation. 
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4.3.2 Derivation by Gaussian Approximation of the Filtering Distribution 

We now indicate how a Gaussian approximation gives rise to the EnKF algorithm 
presented above. We start by defining the conditional measures for ,^2 ^ A/^ 

^ln,\n,{Zn,)=nZn^{yk}Zl) (44) 

In the filtering approach, given the prior distribution lin\n-\ of Zn given data up to the 
previous time t = tn-\ is combined with data provided at the current time time t = tn 
to define the posterior distribution of Zn given data up to the current time t = tn. 
The latter can be obtained from Bayes rule: 

ocexp{-0„(z)} (45) 



where 

1 



Mz) = -\\yn-Hz\\'r. (46) 



The EnKF approach then assumes that is the Gaussian measure N{zi-,cl) 



where zt and cl are the ensemble mean and covariance defined in (35) and (36) 
respectively. Given this Gaussian assumption, it is not difficult to see that ( [45] ) implies 
thatAi^l^(z)=A^(z^,C^) with 

Zn^ =zi^^^Kn{yn-mi^^) (47) 

={I-Kn)Cl (48) 

and Kyi defined in ([38] ). In [16, Appendix A] it has been shown that the ensemble 
updates defined in (pTI) are samples from lin\n{z) = N{z^^C^). In fact, ^6, Appendix 
B] shows that the analysis step ([37]) can be derived from an application of RML under 
the Gaussian approximation ~ N{zljCl). Indeed, it is straight forward to 

show that ( [37] ) can be obtained from 



Zn 



-argminillr, ^(ji'"^ + ||(C{)-2(,-,i^'^))|p) (49) 



which is a sequential version of ( 27 ), for the augmented state z with a prior 
and the linear measurement operator H. 

For our evaluation and comparison of techniques, we consider the outcome of 
the EnKF algorithm after all data has been assimilated in the time interval [0, T]. In 
other words, we are interested in lJ^ri\N,n i^n) =^{zn\ {yk}kZi ) which corresponds to the 
probability of Zn after all data has been assimilated (recall Nm is the total number of 
assimilation times). Then, the posterior lin\Nm(^n) computed via the EnKF algorithm 
provides an approximation to jl defined in ([2Q|. 
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4.3.3 Further Modifications 

While the standard version of EnKF has been successfully applied for some history 
matching problems, several shortcomings due to sampling error have been identified. 
In particular, when a small ensemble is used, spurious correlations often cause gross 
over-estimation of the physical variables that EnKF aims at recovering (e.g. perme- 
ability). In addition to the issues caused by small sample size, standard EnKF with 
a small ensemble is suboptimal when a large amount of data are assimilated. This 
can be easily observed from the two following properties of EnKF. First, the ensem- 
ble updates ( [37] ), when projected into the parameter space are a linear combination 
of the initial ensemble members I16II18I . Second, the ensemble updates minimize 
( [49| ) which involves fitting data at each assimilation time. Therefore, when the prior 
ensemble is small, the EnKF updates cannot fit large amount of data within the sub- 
space generated by the prior ensemble. These shortcomings of using standard EnKF 
have given rise to several EnKF variants designed to reduce the spurious correlations 
described above as well as increasing the number of degrees of freedom. In this work 
we focus on the application of distance-based covariance localization which has re- 
cently been investigated in tSlfTTIl. In particular, the EnKF with localization that we 
implement for the forward models of Section[2]is given by the same EnKF algorithm 



described before, except that ( 38 ) is replaced by 



Kn=poClH^(H{poCl)H^ ^Fn) \ (50) 

Here p, to be defined below, is a positive-definite matrix which induces localization 
and the matrix p o Q is the Schur product between p and Q with entries defined 
by [p oCn]ij = [p]ij[Cn]ij- Du^ to the spurious correlations described above, matrix 
cl may become positive semi-definite and the parameter update then lies in smaller 
subspace than the one generated by the prior ensemble. With properly chosen p the 
matrix p ocl has full rank, and replacing cl with p ocl increases the dimension 
of the linear subspace where the parameter update is sought. This, in turn, results in 
a better estimation. In terms of the block structure previously described, covariance 
localization becomes 

= ^J^f^ + o cr {p„„ ocr+rnT' {y;!^ -M„i% {v^^jf} , ) ) ) (5 1 ) 

As in the covariance localization approach of 0, we consider only localization in the 
w-component (e.g. for the log-permeability updates). In other words, 

= ui^'f^ +p,„oCr{p^.oCr +rn)-\yi'^ -MMvit^^uit^^^^^ (52) 
vl^» = vi^'f'^ +cr{cr +rn)-H//^ -MMv\lf}J^f}))) (53) 
= +cr{Cr+rn)-\y^J^ -MMv^^f},u\lf}))) (54) 

Following the implementation of fTTl, each column of the localization matrix puw 
is defined as the fifth order compact function of Gaspari-Cohn 1 14 | localized at the 
corresponding measurement location. Each row of the matrix p^^ in ([52]) is obtained 
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from puw by projecting it on the corresponding measurement location. By construc- 
tion, puw and are positive definite. 

Recent pubHcations ifTTHSl have investigated optimal choices for the critical length 
of the correlation function used for distanced-based localization. The focus of those 
investigations is to improve the ability of the EnKF with localization to recover the 
truth within the confidence interval provided by the ensemble. In contrast to f^O, 
our goal is to assess the performance of EnKF with localization for reproducing the 
uncertainty quantified by the posterior. However, for the present work we consider 
a fixed critical length obtained from a simple trial-error procedure, that enables us 
to observe significant effect of covariance localization in characterizing the posterior 
distribution. While the optimal choice of covariance localization is beyond the scope 
of the present work, we recognize the importance for assessing optimal choices of 
covariance localization for providing better Gaussian approximation of the posterior 
distribution at a reasonable computational cost. Moreover, additional forms of covari- 
ance regularization (e.g. covariance inflation) should also be assessed. 



4.3.4 Ensemble square root filter (EnSRF) 



Sampling error that arises from perturbing the observations in standard EnKF has 
been often associated with a poor performance of history matching data. In order to 
avoid the aforementioned sampling error, an ensemble square root filter (EnSRF) is 
often used. Here we consider the following EnSRF |ITO| : 



Algorithm 5 (EnSRF) Construct an initial ensemble as in pJ] ). For j = 1, . . . ,A/^ 
(1) Prediction Step: Propagate the ensemble of particles forward under p7| yielding 



{34). Construct the sample mean and covariance from p5\-po\. Additionally 
define the deviations from the mean 



. liJ) 



(55) 



(2) Analysis step: Compute the updated mean Zn^ via formula \47\ with given 



by Consider the matrices AZn := 



Az:n''^\ with jth 



column Azn 
deviations, 



and AZ^ defined analogously. Compute the matrix with updated 



AZ^ = {I-Kr,H)Azi^^0 



(56) 



with 



(57) 



The updated ensemble is then obtained from the expression 



(58) 



Evaluation of Gaussian approximations for data assimilation in reservoir models 



19 



In expression ( |56| ), is 3. Ne x Ne mean-preserving orthogonal random matrix 
constructed as suggested in f23|. The mean-preserving property of ensures that 
TJJLi ^Zn'^"^ = and so the analyzed ensemble ( 58 ) has mean zi^^ as required. In con- 
trast to the EnKF, where ( [48] ) is only exactly sausned in the limit of arbitrarily large 
ensemble size, the sample covariance computed from the EnSRF (finite) ensemble 
updates exactly satisfy (48), therefore providing a better approximation of liri\n{z)', 
it is then hoped that this will lead to a better approximation to the posterior distri- 
bution ( [20] ) itself. A block structure similar to the one introduced before applies to 
the EnSRF. From this structure it is easy to appreciate that only small matrices are 
involved in the square root computations. Therefore, the computational cost of En- 
SRF is essentially the same as EnKF, i.e., Ne times the number of forward model 
evaluations. 

Even though the implementation of the EnSRF avoids the sampling error due to 
perturbing the observations, limitations related to the small ensemble size still apply. 
Distance-based localization can then be applied to the EnSRF as suggested in LIOJ . 
Concretely, we replace ^„ in ( 38 ) with ( 50 ) and ^„ in ( 55 ) with 



-T/2 



(59) 



Similar to the localization procedure for the EnKF, EnSRF is localized only in the 



^/-component (log-permeability ) of ( 47 ) and ( 54 ) with the localization matrix p de- 
scribed above. 



5 Numerical Results 

In this section we present the results of three numerical experiments for assessing the 
Gaussian approximations defined in Section [4] These experiments are described on 
the three subsections which follow, each of which is organized as follows: (i) details 
of the forward model and the generation of synthetic data are provided; (ii) numerical 
results from the gold-standard MCMC implementation described in Section [3] are 
discussed; (iii) the numerical results of Gaussian approximations are presented and 
the results in (iii) are compared against the results from (ii) in terms of their ability 
to reproduce mean and variance; (iv) we assess the performance of the Gaussian 
approximation at quantifying the uncertainty in reservoir model forecast. 



5.1 Single-Phase flow 

For the first experiment we consider the single-phase reservoir model of Section [2] 
on a square domain D = [0,L] x [0,L] with the production wells located at the points 
labeled by Pi , ... ,^9 in Figure [T] (middle). Relevant information of this model is dis- 
played in the first column of Table [Tj For each well term in the right hand side of ([T]) 
we prescribe a production rate of 85m^/day constant during the total simulation time 
of 50 days. 
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True log-permeability Well locations 




Fig. 1 Single-phase model. Left: True log-permeability [logm^]. Middle: Well configuration. Right: 
Gelman-Rubin diagnostic 



We consider a Gaussian prior distribution of log-permeability 

lio{u)=N{u,C) (60) 
where the covariance is defined by C = kA~^, with the operator A = —A defined on 

D{A) = {veH^{D)\'^v n = on dD, / v = 0} (61) 

Jd 

i.e. A is the negative Laplacian with no-flow boundary conditions and restricted to 



spatial average zero functions. The tunable parameters in (60) are defined as follows: 
u{x,y) = log (5 X lO^^m^) for all {x,y) eD,K = 2.0 and a = 1.3. In Figure|2]we show 



some realizations of the prior distribution (60). It is important to mention that other 



choices of C can also be used. In particular, C can be defined in terms of a standard 
correlation function (e.g. spherical, exponential, etc). Our choice, however, has the 
advantage that C becomes a diagonal operator in the spectral domain. Sampling from 
the prior on the spectral domain is straightforward and computationally inexpensive. 



This is a desirable property since at each iteration of MCMC (see equation (22)), a 
draw from the prior is generated for computing the proposal. The correlation function 
of draws from the prior is, of course, simply the Green's function of C. 

For the generation of synthetic pressure data, we first define the "true log-permeability" 
denoted by and displayed in Figure [l] (left). This "true log-permeability" is gen- 
erated from the prior distribution defined above. Synthetic data is generated by first 
solving ([l}-([3]) for p with u = uK Then, G(u^) is calculated from Finally, we 
add random error, i.e. v = G{u^) + 7] with 7] - A^(0, a^I) with a = 4 x 10^ Pa. The 
measurement times used in ^ are t\ =5,tn = lOn days, ^ = {2, . . . , 5}. 

Synthetic data are used in the pCN-MCMC Algorithm [T] with j3 = 0.015. Our 
MCMC results consist of 1 10 chains starting from independent draws from the prior 
distribution. After a burn-in period of 1 x 10^, each chain generates 5 x 10^ samples. 
For assessing the convergence of our chains, we consider the diagnostics suggested 
by Gelman and Rubin in |4 1. In Figure[T](right) we display the maximum of the poten- 
tial scale reduction factor (PSRF) and the multivariate potential scale reduction factor 
(MPSRF) for the smallest / = 16 frequencies that account for 76% of the total prior 
energy defined by e{J) — Y^j=\ ^k/Y.°j=\ where are the eigenvalues of the prior 
covariance C (ordered as Ai > A2 > . . .)• Convergence of the chains is achieved when 
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Table 1 Reservoir model description 





• 

single-pnase reservoir 


water-Oil reservoir 

llUlllUCl Oi WCllS^ 


water-Oil reservoir 

i^ldlgC llUlllUCl Oi WCllS^ 


L[m3] 


10^ 


2x 10^ 


5x 10^ 


c [Pa-^] 


10-8 


0.0 


0.0 


Vo [Pa s] 


10-2 


10-2 


10-2 


T [years] 


0.13 


5 


3.5 


' po [Pa] 


3.5 X 10^ 


2.5 X 10^ 


2.5 X 10^ 




not applicable 


0.2 


0.2 


Vw [Pa s] 


not applicable 


5 X 10-4 


5 X 10-4 




not applicable 


0.2 


0.2 


^ro 


not applicable 


0.2 


0.2 


""Hh [Pa] 


not applicable 


2.7 X 10^ 


2.0 X 10^ 


b ql [mVday] 


not applicable 


2.6 X 10^ 


1.8x 102 




not applicable 


0.3 


0.3 




not applicable 


0.9 


0.9 



Constant inQ.^ Constant in [0, T]. 




the maximum of the PSRF and the IMPSRF are close to one. From Figure [T] (right), the 
max(PSRF) and the IMPSRF have dropped below 1.1 after 5 x 10^ iterations where 
we stablish the convergence of our IMCIMC chains. From the numerical evidence of 
convergence of our chains, we conclude that the MCMC provides samples from the 
posterior. The associated mean and variance fields, denoted by Upos{x) and Oposix), 
are used as gold standard for the assessment of the Gaussian approximations that we 
discuss below. In Figure |3] we display some samples from the independent chains 
(i.e. uncorrected) obtained after convergence was achieved. Although there are sub- 
stantial differences among those realizations, some common spatial features can be 
observed. For example, note the high permeability region around wells Pi and P^. 
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I 



i 



£6.5 

£7 

£7.5 

£8 

£8.5 

£9 

£9.5 

■30 

■30.5 



Fig. 3 Single-phase model. Samples from the posterior distribution (characterized with MCMC) [logm 



Furthermore, the variabiHty is considerably lower than under the prior, as exhibited 
in Figure |2} this indicates that the data used is quite informative. 

The numerical implementation of the following Gaussian approximations are 
conducted for an ensemble of size Ne = 50: LMAP, RML, EnKF, EnKF with lo- 
calization, EnSRF and EnSRF with localization. It is important to emphasize that 
the mean and variance for (the sequential methods) EnKF, EnKF with localization, 
EnSRF and EnSRF with localization, are computed after all the measurements have 
been assimilated. In other words, exactly the same data used to sample the posterior 
via MCMC are also used for all the Gaussian approximations under consideration. 
From each of these approximations, we compute the mean and variance which we 
compare against the mean Upos and variance Opos of the posterior distribution gen- 
erated with the MCMC method. The mean an variance are shown in Figure |4] and 
Figure [5] respectively. In Table [2] we display the relative errors of the deviation of the 
mean with respect to u (the prior mean) and the relative error of the variance defined 
by 



{u — u) — {u 



pos 



\lHd) 



and 



^pos 



\lHd) 



\^ - ^POs\\l^{P) 
\\^POs\\l^(D) 



(62) 



respectively. In the previous expression, u and a are the mean and variance of the 
Gaussian approximation under consideration. The right column of Table |2] indicates 
the computational cost for computing each of the techniques in terms of forward 
model runs. As stated in Section |4j while the computational cost of the Kalman filter- 
type of methods is stable with respect to the details of the implementation, the cost 
of RML and LMAP depends crucially on the optimization technique used for solv- 
ing ([24]) and ( [27] ). Our implementation of the Levenberg-Marquardt technique cost 
around 5 forward model runs per iteration. Furthermore, in average, each of opti- 
mization problems ( [27] ) converged in 5 iterations and so the average computational 
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cost of (27) is 25 forward model runs. This computational cost can be potentially 
reduced by applying a more efficient optimization technique. 

Since we are assessing the Gaussian approximations only in terms of mean and 
variance, we can additionally measure the error (with respect to the posterior) of the 
exact mean and variance of N{umap^Cmap) given directly ( 24 ) and ( 25 ), respectively. 
The corresponding relative errors are provided in the "MAP" row in Table [2] From 
construction it is clear that the mean and covariance of LMAP are umap and Cmap 
for sufficiently large Ne. The results for this experiment indicate that N{umap-,Cmap) 
provides a good approximation to the posterior in terms of mean and variance. There- 
fore, more samples from LMAP can be generated at a negligible cost so that its mean 
and variance approaches umap and Cmap^ respectively. 

Table [2] also indicates that, among the all the Gaussian approximation with Ne = 
50, RML provides the best approximation in terms of the mean. The worst perfor- 
mance in terms of mean and variance was obtained with EnKF. However, consid- 
erable improvement was obtained by applying the localization approach described 
in the preceding section (see equation ([52])). We recall from exposition of Section 
[4] that EnSRF reduces the sampling error that arises from standard implementations 
EnKF where data are perturbed with noise. From Table [2] we observe that the effect 
of sampling error has a detrimental effect in the performance of EnKF for reproduc- 
ing the posterior distribution. More precisely, EnSRF outperformed the EnKF both 
in terms of mean and variance with respect to the posterior distribution. Note that the 
application of localization in the EnSRF (expression ( 55 )) further reduces the relative 
errors in the mean and variance. In fact, among all techniques with A/^ = 50, the best 
approximation in terms of variance is given by the EnSRF with localization. 

From the preceding comments it clear that sampling error causes severe limita- 
tions in the performance of EnKF and EnSRF. It is worth mentioning that issues of 
the EnKF and EnSRF due to sampling error have been often reported 1 5 , 21 1 and used 
as motivation for covariance regularization. However, this existing work is focused 
on (i) history matching production data (ii) recovering the true permeability and (iii) 
recovering data generated with the true permeability within the estimated confidence 
interval. In contrast, here we assess the performance in terms of the posterior distri- 
bution of the Bayesian framework. 

Since sampling error due to the small ensemble size severely limits the ability 
of EnKF to produce reasonable approximations of the posterior, we consider three 
more additional implementations of EnKF for larger ensembles: Ne = 1250, Ne = 
2500 and A^^ = 5000. These results appear at the end of Table [2] Note that A^^ = 
1250 corresponds to the case where the computational cost of EnKF coincides with 
the cost of our implementation of RML. While the performance of EnKF improved 
significantly forNe = 1250, RML still provides a better approximation in terms of the 
mean. In addition, we observe that although increasing the size of the ensemble may 
reduce the sampling error, this is not associated to the convergence to the posterior. 
Actually, it is clear from this experiment that the variance of EnKF for large A^^ seems 
to diverge from the variance of the posterior. 

With the previous results we are able to appreciate and evaluate the differences 
in the approximations provided by each of the Gaussian approximations of the poste- 
rior. This posterior is the conditional probability of the unknown (permeability) given 
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production data collected during the 50 days of simulation. For practical applications 
it is of particular interest to assess how different Gaussian approximations fare at re- 
producing the probability distribution of various predicted quantities, with respect to 
the posterior; in other words to assess how the approximate algorithms fare in the 
quantification of uncertainty in these predictions. The assessment of performance in 
terms of the distribution of predictions under the posterior is conducted by creating a 
new flow scenario as we now describe. Assume that after the initial 50 days of sim- 
ulation (that we used to generate synthetic data), we now drill new wells labeled by 
Pio, P\i,P\2 and Pi3 in Figure[T] These new wells are operated at constant production 
rate of 60m^/day during 100 days. During this 100 days of forecast, the old wells 
Pi, ... ,^9 are first shut-down for a pressure build-up time window of 50 days, fol- 
lowed by a constant production of 60m^/day during the rest 50 days. In Figure|6]we 
show, as a function of time, the pressure at the well locations Pg, Pio and P^. The first 
50 days corresponds to the data assimilation phase and the subsequent 100 days are 
the prediction. In the first row of Figure [6] we display the pressure obtained from the 
reservoir simulation with 100 permeabilities obtained from the prior distribution ( 60 ). 
The second row corresponds to the pressure obtained from the simulation with the 
samples from the posterior obtained from independent MCMC chains. Subsequent 
rows of Figure |6] corresponds to pressure obtained from simulating the permeabilities 
obtained from some of the Gaussian approximations under consideration. The verti- 
cal line divides data assimilation phase from the forecast. Additionally, since we are 
interested in the performance with respect to the posterior, for each curve presented 
in Figure [6] we include a red curve of the pressure at the corresponding well location 
obtained by simulating the mean of the posterior distribution (i.e. top-left field of 
Figure [4]). 

Note that the uncertainty quantified by the posterior and the Gaussian approxima- 
tions is considerably small at the wells Pi , . . . , where measurements were collected. 
In contrast, large uncertainty in the forecast is observed for the new wells Pio, . . . , P13 
for which data was not available during the data assimilation phase. Note for exam- 
ple that in Pii , the posterior is visually close to the prior, indicating the uninformative 
effect of the data at the location of Pi 1 . For the new wells where uncertainty is larger, 
we can appreciate the performance on the Gaussian approximations. Note from Fig- 
ure [6] that the EnKF (without localization) at well Pn underestimates the uncertainty 
in the model predictions. 

In Figure|7]we display the distribution of the pressure at the new wells Pio, . . . , P13 
at the final time ^ = 150 days. The horizontal line correspond to the value of the 
pressure at the corresponding location obtained from simulating the model with the 
posterior mean. From [6] and [7] we observe that LMAP and RML provide a better ap- 
proximation to the predicting distribution than the EnKF-based methods. Since RML 
produces the best approximation of the posterior in terms of mean with a reasonable 
approximation of the variance, the associated approximation of the predicting distri- 
bution is the most accurate among the techniques considered here. Our results with 
respect to the optimality of RML for this experiment are similar to those reported in 
1 17] where a single-phase reservoir model was also utilized. As we will see in the 
next experiments, a less favorable performance of RML is observed for a two-phase 
reservoir model. 
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Fig. 4 Single-phase model. Mean of the posterior distribution (characterized with MCMC) and Gaussian 
approximations [logm^] 



Table 2 Evaluation of Gaussian approximations on the single-phase model 



Method 




Relative error 


Relative error 


Computational cost 






in the mean 


in the variance e^y 


[Forward model runs] 


MCMC 




0.000 


0.000 


5.5 X 10^ 


MAP 




0.030 


0.094 


2.5 X 10^ 


LMAP (Ne = 50) 




0.179 


0.259 


2.5 X 10^ 


RML (Ne = 50) 




0.154 


0.258 


1.25 X 10^ 


EnKF (Ne = 50) 




0.643 


0.417 


5.0 X 10^ 


EnKF (localization, Ng - 


^50) 


0.546 


0.288 


5.0 X 10^ 


EnSRF {Ne = 50) 




0.519 


0.267 


5.0 X 10^ 


EnSRF (localization, Ne 


= 50) 


0.445 


0.208 


5.0 X 10^ 


EnKF {Ne = 1250) 




0.192 


0.075 


1.25 X 10^ 


EnKF {Ne = 2500) 




0.120 


0.089 


2.5 X 10^ 


EnKF {Ne = 5000) 




0.102 


0.094 


5.0 X 10^ 



5.2 Oil- water reservoir: Small number of wells 

In this subsection we consider the oil- water reservoir model described in Section |2l 
The reservoir is a square D = [0,L] x [0,L] with a five-spot well configuration con- 
sisting on production wells Pi , . . . and one injection well /i as displayed in Figure 
[8] middle (well P5 will play a role in later discussions). Relevant information con- 
cerning this model is displayed in Table [T] The prior distribution of log-permeability 
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Fig. 5 Single-phase model. Variance of the posterior distribution (characterized with MCMC) and Gaus- 
sian approximations [(logm^)^] 



is defined by the Gaussian measure defined in ( 60 ) with same parameters except K 
which for this experiment is K = 4.0. With these parameters, similar realizations to 
those of Figure [2] are obtained but with a range of variability with respect to the prior 
mean u increased by a factor of V2. 

Similarly to the previous example, for the generation of synthetic data we define 
the "true log-permeability" displayed in Figure[8](left). This "true log-permeability" 
is generated from the prior distribution described in the preceding paragraph. Note 
that is the same as the one used for the previous experiment (see Figure [l] (left)) 
but with the magnitude of — u multiplied by V2. The generation of synthetic data 
is now conducted by computing (p^s) from ([7|-([8]) with u = u^. Equation ( [it] ) is then 
used to compute G(w^) and zero mean Gaussian random error r] is added to generate 
data y = G{u^) + r]. The measurement times used in ( 14 )-( 15 ) are tn = 0.67n years, 
^ = {1,...,7}. According to the structure of (p^-( 17 ), T] has the following form 



1,/ 



Ni,I 1,P Np,P 
1 'In 1 'In 1 ' ' ' 1 'In 



We generate the components of r] as follows 



?7^''' -A^(0,(3.2x lO'^Pa)^) 



2P 3 P 4 P 
^Vn T^n 



77„''^~A?(0,(0.25mVday)2) 
A^(0,(0.02mVday)2). 



(63) 



(64) 



for all ^ G { 1 , . . . , 7}. In the previous definitions we consider larger measurement error 
at the production well Pi since larger total flow rates are obtained. This is caused by 
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Fig. 6 Single-phase model. Pressure from wells P\i (right column), (middle column) and Pi (left col- 
umn) simulated with permeabilities sampled from (top to bottom rows) the prior, the posterior, LMAP, 
RML, EnKF and EnKF with localization. 
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Fig. 7 Single-phase model. Distribution of pressure at wells Pio, ■■■,P\3 at the final time of simulation 
f = 150 days 



the larger permeability region around Pi and so the early water breakthrough at this 
well. 

The pCN-MCMC Algorithm [T] with j3 = 0.015 is appHed to this problem with the 
synthetic data previously described. 110 chains are generated starting from indepen- 
dent draws from the prior distribution. A burn-in period of 1.5 x 10^ was observed 
after which the chains were run 5x10^ iterations. By using the same Gelman-Rubin 
indicated in the previous example, we determine convergence of our chains. This 
numerical evidence is presented in Figure [s] where, after 5 x 10^ iteration, both the 
max PRSF and MPRSF for the highest energy models converges to one. Uncorrelated 
samples (from independent chains) are shown in Figure [9] 

Samples of the posterior from our converged chains define our gold standard. 
These are then used to compare the performance of Gaussian approximations in terms 
of mean and variance. Analogous to the previous example, we use an ensemble with 
of size Ne = 50 and compute the mean and variance with LMAP, RML, EnKF, EnKF 
with localization, EnSRF and EnSRF with localization. Relative errors of the mean 
and variance (see expression ([62| ) ) with respect to the posterior distribution are pro- 
vided in Table |3] In Figure |10| and Figure [TT] we present the mean and variance, 
respectively. 

In contrast to the previous experiment, the error in the approximation given by 
N{umap^Cmap) is relatively large. The relative errors of mean and variance are 27% 
and 14% respectively. Similar to the previous experiment, RML provided the best 
approximation of the posterior in terms of the mean. However, the variance of the 
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posterior was significantly overestimated by RML. The performance of the standard 
EnKF for Nen = 50 was very poor. From Figure 10 we observe large values of the 
estimated field, which are typically found in standard EnKF applications with small 
ensemble sizes. Nevertheless, for the same size, covariance localization has a posi- 
tive effect by reducing the error in the mean and variance with respect to the pos- 
terior. Similar to the previous experiment, EnKF and EnKF with localization were 
outperformed by the corresponding EnSRF implementations. As in the previous ex- 
periment, EnSRF with localization provided, among the techniques with A/^^ = 50, the 
best approximation in terms of variance. 

Unlike the preceding experiment, here we observe that increasing the size of the 
ensemble does not result in a decrease of the error with respect to the mean. This 
can be observed from at the end of Table [3] where we report the results of EnKF 
implementations for Nen = 1000, Nen = 3000, Nen = 8000. Both in terms of mean and 
variance, EnKF for large ensembles does not exhibit convergence to the posterior. 
Note that EnKF with Nen = 3000 corresponds to the same computational cost of 
RML. Yet, the latter provides a better approximation in terms of the mean. Among 
all the techniques, LMAP provides reasonable approximations of both the mean and 
variance of the posterior. 

In order to assess the performance of the approximate posterior samples at re- 
producing the predicting distribution, we consider an additional simulation period 
of 5 years of forecast. For this additional 5 years, a new well labeled as Ps in Fig- 
ure [8] (middle) is drilled and operated under constant fixed bottom-hole pressure 
P^j^ = 2.7 X 10^ Pa. In Figure 12 we present the total flow rates (from Pi and Ps) 



and bottom-hole pressure (from /i) simulated with permeabilities from the prior (first 
row), the posterior (second row), and some of the Gaussian approximations under 
analysis (third- sixth row). The vertical line divides the assimilation from the predic- 
tion. The red curve is computed from the posterior mean at the corresponding loca- 
tion. We note that the poor performance of EnKF is reflected in the model predictions 
whose ensemble does not capture the prediction based on the mean of the posterior. 
These issues are alleviated by localization as we observe from Figure 12 (last row). 

In Figure 13 we display the distribution of the final time cumulative oil production 
simulated from the posterior and the Gaussian approximation. Note that even though 
LMAP provides a reasonable approximation in terms of both mean and variance, the 
approximation provides a deficient characterization of the predicting distribution. In 
general all the Gaussian approximations exhibit poor performance at reproducing the 
predicting distribution. 



5.3 Oil- water reservoir: Large number of wells 

In this subsection we consider again the oil-water reservoir model described in Sec- 
tion [2] The well configuration for this case is displayed in 14 (middle) and relevant 
information can be found in Table [T] The aim of this experiment is to evaluate the 
performance of Gaussian approximations when measurements from many wells are 
available. The prior distribution of log-permeability is defined by the Gaussian mea- 
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Table 3 Evaluation of Gaussian approximations for the two-phase model. Case with small number of 
wells. 



iVlcLllOQ 




Relative error 


Relative error 


Computational cost 






111 LllC lllCdll C-ii 


111 LllC Vallalll^C C(y 


Ll UlWcllU- liiULlCl 1 UllfJj 


MCMC 




0.000 


0.000 


5.5 X 10^ 


MAP 




0.277 


0.143 


6.0 X 10^ 


LMAP (Ne = 50) 




0.284 


0.286 


6.0 X 10^ 


RML (Ne = 50) 




0.253 


0.475 


3.0 X 10^ 


EnKF (Ne = 50) 




1.159 


0.424 


5.0 X 10^ 


EnKF (localization, Ng - 


^50) 


0.600 


0.263 


5.0 X 10^ 


EnSRF {Ne = 50) 




0.715 


0.397 


5.0 X 10^ 


EnSRF (localization, Ne 


= 50) 


0.483 


0.259 


5.0 X 10^ 


EnKF {Ne = 1000) 




0.353 


0.209 


1.0 X 10^ 


EnKF {Ne = 3000) 




0.301 


0.222 


3.0 X 10^ 


EnKF {Ne = 8000) 




0.337 


0.216 


8.0 X 10^ 
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Fig. 8 Two-phase model (small number of wells). Left: True log-permeability [logm^]. Middle: Well 
configuration. Right: Gelman-rubin diagnostic 




Fig. 9 Two-phase model (small number of wells). Samples from the posterior distribution (characterized 
with MCMC) [logm^] 
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Fig. 10 Two-phase model (small number of wells). Mean of the posterior distribution (characterized with 
MCMC) and Gaussian approximations [logm^] 



sure defined in ( 60 ) with the same tunable parameters used in the experiment of sub- 
section ISTT] 

Generation of synthetic data was conducted with the same procedure described 



before. The "true log-permeabiHty" u'^ is displayed in Figure 
error r] added to G{u'^) has the form of (63 ) with Nj 

r]^'^-A^(0,2.7x lO'^Pa), 
r7^'^-A^(0,0.06mVday), 



14 



9andA^P = 

je{h...,Nj}, 
je{h...,Np}. 



(left). The random 
= 16 and 



(65) 



for all ^ G {1, ... ,7}. Measurement times are = 0.467^ years, ^ = {1, . . . ,7} The 
pCN-MCMC Algorithm[T]is applied to generate 110 chains starting from independent 
draws from the prior distribution. After a burn-in period of 1.5 x 10^ the chains are 
run for 5 x 10^ iterations. The Gelman-Rubin diagnostic is conducted as describe 



in the preceding sections. Figure 14 (right) show the PRSF and MPRSF as defined 
previously. Uncorrelated samples (from independent chains) are shown in Figure 15 
Similar to the previous section, converged chains provide the posterior against 
which to compare the performance of Gaussian approximations in terms of mean 
and variance. The first part of TableH] provides the results when an ensemble with 
of size Ne = 50 is used. In Figure [T6[and Figure [17] we display mean and variance, 
respectively. 

Among all the ensemble methods with Ne = 50, RML provides the best approx- 
imation in terms of mean. Note that the approximation provided by N{umap^Cmap) 
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Fig. 11 Two-phase model (small number of wells). Variance of the posterior distribution (characterized 
with MCMC) and Gaussian approximations [(logm^)^] 



provides the best approximation in terms of combined mean and variance. Addition- 
ally, even with localization both EnKF and EnSRF provide a very poor approximation 
in terms of mean and variance. It is worth mentioning that RML and LMAP provided 
a better approximation (in terms of mean and variance) than the ensemble Kalman 



filter-type methods for Ne = 50. In Figure 18 we show the total flow rates (from P\ 
and P^) and bottom-hole pressure (from I\) simulated with permeabilities from the 
prior (first row), the posterior (second row), and some of the Gaussian approxima- 
tions under analysis (third- sixth row). The vertical line divides the assimilation from 
the prediction. In this case, prediction is performed by simulating an additional 3.5 
years under the same well configuration. The red curve is computed from the poste- 



rior mean at the corresponding location. In Figure [19] we display the distribution of 
the final time cumulative oil production simulated from the posterior and the Gaus- 
sian approximation. The poor performance of EnKF and EnSRF is reflected in the 
poor performance at characterizing the predicting distribution. 

As we mentioned earlier, limitations of the EnKF and EnSRF arise when large 
number of measurements are assimilated. In the present work we are interested in 
the associated detrimental effect on the approximation of the posterior distribution. 
In order to observe that effect, we now consider application of our Gaussian approxi- 
mation on a larger ensemble Ne = 250. These results are presented in the second part 
of Table |4] Note that for Ne = 250 the ratio of total number of wells to ensemble size 
is the same as in the previous experiment (A^^ = 50 and N^ = 5). For Nen = 250, Table 
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Fig. 12 Two-phase model (small number of wells). Total flow rates from Pi (left column), P5 (middle 
column) and bottom-hole pressure from /i (right column) simulated with permeabilities sampled from 
(top to bottom rows) the prior, the posterior, LMAP, RML, EnKF and EnKF with localization. 
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Fig. 13 Two-phase model (small number of wells). Distribution of cumulative oil production at wells 
Pi , . . . , P5 at the final time of simulation r = 10 years 



|4] indicates that RML provides again the best approximation in terms of mean. The 
performance of EnKF, EnSRF and their locaHzations are considerably improved with 
respect to the ones for A^^ = 50. As in previous examples, EnSRF with localization 
provides the best approximation in terms of variance. Also similarly to the previous 
experiments, increasing the size of the standard EnKF does not improve the approx- 
imation in terms of variance. On the other hand, in this case the variance increases 
with the size of ensemble. This can be observed from the last part of Table |4] where 
EnKF was implemented for Nen = 1000, Men = 3000, Nen = 6000 and Men = 18500. 
Note that the computational cost of EnKF for Nen = 18500 coincides with the cost of 
our implementation of RML with Nen = 50. 
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Table 4 Evaluation of Gaussian approximations for the two-phase model. Case with large number of 
wells. 



Method 


Relative error 


Relative error 


Computational cost 




in the mean £^ 


in the variance £0 


[Forward model runs] 


MCMC 


0.000 


0.000 


5.5 X 10^ 


MAP 


0.131 


0.165 


3.5 X 10^ 


LMAP {Ne = 50) 


0.179 


0.287 


3.5 X 102 


RML (Ne = 50) 


0.169 


0.307 


1.85 X 10"^ 


EnKF (Ne = 50) 


0.932 


0.816 


5.0 X 10^ 


EnKF (localization, Ne = 50) 


0.635 


0.616 


5.0 X 10^ 


EnSRF {Ne = 50) 


0.862 


0.658 


5.0 X 10^ 


EnSRF (localization, A^^ = 50) 


0.539 


0.471 


5.0 xlO^ 


LMAP {Ne = 250) 


0.146 


0.190 


3.5 X 10^ 


RML {Ne = 250) 


0.121 


0.231 


9.25 X 10^ 


EnKF {Ne = 250) 


0.434 


0.166 


2.5 X 10^ 


EnKF (localization, A^^ = 250) 


0.304 


0.113 


2.5 X 10^ 


EnSRF {Ne = 250) 


0.371 


0.110 


2.5 X 10^ 


EnSRF (locaHzation, Ne = 250) 


0.285 


0.101 


2.5 X 10^ 


EnKF {Ne = 1000) 


0.243 


0.101 


1.0 X 10^ 


EnKF {Ne = 3000) 


0.161 


0.137 


3.0 X 10^ 


EnKF {Ne = 6000) 


0.127 


0.148 


6.0 X 10^ 


EnKF (large iVe = 18500) 


0.111 


0.154 


1.85 X 10^ 



True log-permeability Well locations 




Fig. 14 Two-phase model (large number of wells). Left: True log-permeability [logm^]. Middle: Well 
configuration. Right: Gelman-rubin diagnostic 




Fig. 15 Two-phase model (small number of wells). Samples from the posterior distribution (characterized 
withMCMC) [logm^] 
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Fig. 16 Two-phase model (large number of wells). Mean of the posterior distribution (characterized with 
MCMC) and Gaussian approximations [logm^] 
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Fig. 17 Two-phase model (large number of wells). Variance of the posterior distribution (characterized 
with MCMC) and Gaussian approximations [(logm^)^] 
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Fig. 18 Two-phase model (large number of wells). Total flow rates from Pi (left column), P5 (middle 
column) and bottom-hole pressure from /i (right column) simulated with permeabilities sampled from 
(top to bottom rows) the prior, the posterior, LMAP, RML, EnKF and EnKF with localization. 
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Fig. 19 Two-phase model (large number of wells). Distribution of cumulative oil production at wells 
P7, Pi 0,^1 2, ^14, ^16 at the final time of simulation f = 10 years 
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6 Conclusions 

The controlled experiments from the preceding sections enable us to numerically as- 
sess the performance of widely used ad hoc Gaussian approximations with respect 
to their ability to correctly reproduce the mean and variance under the true Bayesian 
posterior distribution. The true posterior is obtained by use of the expensive, but 
accurate, gold-standard preconditioned Crank-Nicolson Markov Chain Monte Carlo 
(pCN-MCMC) method. The forward operators associated to the reservoir models 
under consideration are nonlinear. Therefore, even though a Gaussian prior is con- 
sidered, the posterior distribution associated to each experiment is non-Gaussian. In- 
deed, for all our experiments a significant discrepancy between the maximum a pos- 
teriori (MAP) estimate and the mean of the posterior distribution was obtained giving 
clear numerical evidence that the posteriors for the Bayesian data assimilation prob- 
lems under consideration are not Gaussian. Even in the single-phase reservoir whose 
associated forward operator is "less" nonlinear (the PDE ([T]) is linear with respect to 
p), we find that the relative error in the MAP estimator with respect to the mean of the 
posterior was 3%. For the two-phase reservoirs, this relative error was 27% and 13% 
in the case of small and large number of wells, respectively. Thus the problems that 
we study demonstrate a range of deviations from Gaussianity in the posterior. This 
makes them a suitable range of test problems for the ad hoc algorithms, all of which 
can be systematically derived in the linear Gaussian scenario, but whose accuracy in 
the non-Gaussian case is, in general, unclear. 

We clearly observe substantial differences in the approximation properties of the 
posterior distribution with respect to the choice of method, reservoir model and well 
configuration. For all the experiments conducted here we conclude that, among all the 
Gaussian approximations under analysis, the linearization around the MAP (LMAP) 
is arguably the best technique at reproducing the posterior distribution in terms of 
combined variance and mean. It is interesting to speculate why this might be so, 
and what it tells us about the posterior. The LMAP algorithm is the only Gaussian 
approximation which samples from N{umap^Cmap) where umap is the MAP estimate 
and Cmap the associated covariance matrix defined by ([24]) and ( [25] ), respectively. 
This suggests that, out of all the Gaussian approximations considered, the posterior 
distribution in all our examples can be best approximated, in terms of mean and 
variance, by N{umap^Cmap)- We emphasize that this does not imply that the posterior 
distribution is Gaussian. Indeed, although this may be the best approximation, errors 
may still be large. 

We recall that all the techniques described in Section [4] produce samples of the 
posterior distribution in the linear-Gaussian case. In other words, they sample from 
the exact posterior distribution N{umapiCmap)- In our experiments, however, we ob- 
serve clear differences in the approximations obtained with each of the techniques 
under consideration. For example, note from all our experiments that the random- 
ized maximum likelihood (RML) provided the best approximation of the posterior in 
terms of the mean. In addition, in the case of single-phase, RML provides a reason- 
able approximation of the posterior variance (like the one obtained with LMAP). It 
is worth mentioning that favorable RML results for single-phase reservoirs are also 
reported in flTl . In contrast, for the two-phase model we find examples where the 
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error of the RML variance is the largest compared to other Gaussian approximations. 
These observations are Hkely to be related to the higher nonlinearity in the two-phase 
forward operator ^Tf) that results from the nonlinear PDE system (|7])-([8]). Due to the 
aforementioned higher nonlinearity, large changes in the absolute values of the log 
permeability field may not necessarily correspond to large changes in the production 
data. In fact, production data may typically have smaller sensitivity to the permeabil- 
ity values far from (or in-between) the well locations. On the other hand, we recall 
from the RML algorithm, that each ensemble member u^^l^^ (see equation pro- 
duces a model output Giu'^j^j) that is close to the perturbed data y^^^ while keeping 

^//ml close to the corresponding sample from the prior u^j\ Due to the aforemen- 
tioned small sensitivity of production data to the log-permeability in some regions 
of the domain, it may be possible that the penalty term | |t/^^2fL ~ ^^^^ 1 1 I^t] ) may 
not provide sufficient constraint to avoid possible large values of | u'Im^ \ in the afore- 
mentioned regions for which the (perturbed) production data is minimally affected 
by large value of permeability. Although in our controlled experiment LMAP out- 
performed the RML in terms of combined mean and variance, it has been reported 
that RML has the advantage of approximating multimodal distribution for which 
N{umaPjCmap) and therefore LMAP is suboptimal. The assessment of techniques 
where multimodal posterior distribution arises deserves further investigation; how- 
ever it has not formed part of our studies which have been confined to problems with 
unimodal posteriors. Moreover, we recall that the computational cost of RML can 
be amortized if each ensemble member is computed in parallel. Therefore, the cost 
of the parallel implementation of RML equals the cost of LMAP. Note also that, for 
very large problems, the factorization of Cmap used in ([26]) may be computationally 
prohibitive while the covariance of RML is computed directly from the ensemble at 
a negligible cost. 

For each of our experiments, very poor approximations of the posterior distri- 
bution are obtained with the ensemble Kalman filter (EnKF) with a small ensemble 
size. However, covariance localization leads to a significant reduction in the rela- 
tive error of the mean and variance with respect to the posterior. Note for example 
that, in the two-phase model with small number of wells, localization reduces the 
relative error in the mean and the variance by a factor of two. Additionally, the en- 
semble Kalman smoother (EnSRF) provides better approximations of the posterior 
(in terms of mean and variance) than the ones obtained with EnKF. Furthermore, in 
all our experiments, the ensemble generated with EnSRF with localization provides 
the best approximation of the posterior in terms of variance. The advantage of using 
covariance localization as well as using EnSRF instead of EnKF has been widely in- 
vestigated in terms of reconstructing the truth and/or recovering the truth within the 
confidence intervals provided by the ensemble approximations. Our results offer now 
numerical evidence of the advantage of using covariance localization and square root 
filters for reconstructing the posterior distribution. The choice of covariance localiza- 
tion that provides optimal approximation of the posterior distribution must be further 
investigated. 

Reducing the detrimental effect of sampling error due to the small ensemble size 
and the possible large amount data is essential in practical applications where a small 
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number of ensemble members is required to avoid high computational cost in data 
assimilation. Nonetheless, our results indicate that even for a large ensemble size 
where presumably sampling error issues are attenuated, we find that EnKF does not 
converge to the posterior distribution. In fact, as the ensemble sizes increased, the 
converged Gaussian approximation provided by EnKF resulted in errors of at least 
10% both in mean and variance. In addition, the approximations provided with those 
large size ensembles do not coincide with the approximations provided by either 
LMAP and RML. 

In summary, our study sheds light on various aspects of the ad hoc Gaussian 
approximate filters used in practice to approximate high dimensional posterior distri- 
butions on geological reservoir properties. The study has been made possible by use 
of a fully resolved gold-standard MCMC computation which allows for a clear and 
well-founded evaluation of the ad hoc algorithms. In our opinion more evaluations 
of this kind will be beneficial in guiding the future evolution of the ad hoc Gaussian 
approximate filters that are so widely used in practice. 
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